Scalable imputation of genetic data with a discrete fragmentation-coagulation process

Authors

  • Lloyd T. Elliott
  • Yee Whye Teh
Abstract

We present a Bayesian nonparametric model for genetic sequence data in which a set of genetic sequences is modelled using a Markov model of partitions. The partitions at consecutive locations in the genome are related by the splitting and merging of their clusters. Our model can be thought of as a discrete analogue of the continuous fragmentation-coagulation process [Teh et al 2011], preserving the important properties of projectivity, exchangeability and reversibility, while being more scalable. We apply this model to the problem of genotype imputation, showing improved computational efficiency while maintaining accuracies comparable to other state-of-the-art genotype imputation methods.
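
To make the idea of a Markov model of partitions concrete, the following is a toy sketch only. It is not the authors' model: the split and merge rules, the probabilities `p_split` and `p_merge`, and the function names `fragment`, `coagulate` and `simulate` are all hypothetical, and the sketch makes no attempt to preserve projectivity, exchangeability or reversibility. It only shows a partition of a set of sequences changing from one genome location to the next by splitting and merging clusters.

```python
import random


def fragment(partition, rng, p_split=0.3):
    """Toy fragmentation: split each cluster in two with probability p_split."""
    out = []
    for cluster in partition:
        items = sorted(cluster)
        if len(items) > 1 and rng.random() < p_split:
            rng.shuffle(items)
            cut = rng.randint(1, len(items) - 1)  # both halves stay non-empty
            out.append(set(items[:cut]))
            out.append(set(items[cut:]))
        else:
            out.append(set(items))
    return out


def coagulate(partition, rng, p_merge=0.3):
    """Toy coagulation: merge a cluster into an earlier one with probability p_merge."""
    out = []
    for cluster in partition:
        if out and rng.random() < p_merge:
            rng.choice(out).update(cluster)
        else:
            out.append(set(cluster))
    return out


def simulate(n_sequences, n_sites, seed=0):
    """Return one partition of range(n_sequences) per site (hypothetical toy chain)."""
    rng = random.Random(seed)
    partition = [set(range(n_sequences))]  # all sequences start in one cluster
    path = [partition]
    for _ in range(n_sites - 1):
        # Consecutive sites are related by a split step followed by a merge step.
        partition = coagulate(fragment(partition, rng), rng)
        path.append(partition)
    return path


if __name__ == "__main__":
    for site, partition in enumerate(simulate(n_sequences=6, n_sites=5)):
        print(site, [sorted(cluster) for cluster in partition])
```

Each printed line is the clustering of the six toy sequences at one site; in an imputation setting, sequences sharing a cluster at a site would be the ones expected to share an allele there.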

Similar resources

Modelling Genetic Variations with Fragmentation-Coagulation Processes

We propose a novel class of Bayesian nonparametric models for sequential data called fragmentation-coagulation processes (FCPs). FCPs model a set of sequences using a partition-valued Markov process which evolves by splitting and merging clusters. An FCP is exchangeable, projective, stationary and reversible, and its equilibrium distributions are given by the Chinese restaurant process. As oppo...

Bayesian Nonparametric Modelling of Genetic Variations using Fragmentation-Coagulation Processes

We propose a novel class of Bayesian nonparametric models for variations in genetic data called fragmentation-coagulation processes (FCPs). FCPs model a set of sequences using a partition-valued Markov process which evolves by splitting and merging clusters. FCPs have a number of theoretically appealing properties: they are infinitely exchangeable, stationary and reversible, with equilibrium di...

Modelling Genetic Variations using Fragmentation-Coagulation Processes

We propose a novel class of Bayesian nonparametric models for sequential data called fragmentation-coagulation processes (FCPs). FCPs model a set of sequences using a partition-valued Markov process which evolves by splitting and merging clusters. An FCP is exchangeable, projective, stationary and reversible, and its equilibrium distributions are given by the Chinese restaurant process. As oppo...

Discrete element modeling of explosion-induced fracture extension in jointed rock masses

The explosion process of explosives in a borehole applies a very high pressure on its surrounding rock media. This process can initiate and propagate rock fractures, and finally, may result in the rock fragmentation. Rock fragmentation is mainly caused by the propagation of inherent pre-existing fractures of the rock mass and also from the extension of the newly formed cracks within the intact ...

Experimental investigation, modeling, and optimization of combined electro-(fenton/coagulation/flotation) process: design of experiments and artificial intelligence systems

In this study, a combined electro-(Fenton/coagulation/flotation) (EF/EC/El) process was studied via degradation of Disperse Orange 25 (DO25) organic dye as a case study. Influences of seven operational parameters on the dye removal efficiency (DR%) were measured: initial pH of the solution (pH0), applied voltage between the anode and cathode (V), initial ferrous ion concentration (CFe), initial...

Publication date: 2012